=begin pod :tag<perl6>

=TITLE Input/Output: The Definitive Guide

=SUBTITLE Correctly use Perl 6 IO

=head1 The Basics

The vast majority of common IO work is done by the L<IO::Path> type. If you
want to read from or write to a file in some form or shape, this is the class
you want. It abstracts away the details of file handles (or "file descriptors")
and so you mostly don't even have to think about them.

Behind the scenes, L<IO::Path> works with L<IO::Handle>, a class which you
can use directly if you need a bit more control than what L<IO::Path>
provides. When working with other processes, e.g. via the L<Proc> or
L<Proc::Async> types, you'll also be dealing with a I<subclass> of
L<IO::Handle>: the L<IO::Pipe>.
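
As a rough illustration of that relationship, the sketch below runs a second
copy of the compiler through L<Proc> and reads its output from the pipe. The
use of C<$*EXECUTABLE> and the one-line C<say 42> program are choices made for
this example only:

=begin code
# Run the current perl6 executable as a child process and capture its STDOUT
my $proc = run $*EXECUTABLE, '-e', 'say 42', :out;

say $proc.out ~~ IO::Handle;    # OUTPUT: «True␤»
print $proc.out.slurp: :close;  # OUTPUT: «42␤»
=end code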

Lastly, you have the L<IO::CatHandle>, as well as L<IO::Spec> and its
subclasses, that you'll rarely, if ever, use directly. These classes give you
advanced features, such as operating on multiple files as one handle, or
low-level path manipulations.

Along with all these classes, Perl 6 provides several subroutines that
let you work with these classes indirectly. These come in handy if you like
a functional programming style or are writing Perl 6 one-liners.
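
A minimal sketch of the subroutine forms, assuming a throwaway file in
C<$*TMPDIR>. The file name is made up for this example:

=begin code
# A throwaway file in the temporary directory; the name is made up
my $file = $*TMPDIR.add: 'io-guide-subs-demo.txt';

spurt $file, "I ♥ Perl!";   # same as $file.spurt: "I ♥ Perl!"
say slurp $file;            # OUTPUT: «I ♥ Perl!␤»
unlink $file;               # same as $file.unlink
=end code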

While L<IO::Socket> and its subclasses also have to do with Input and Output,
this guide does not cover them.

=head1 Navigating Paths

=head2 What's an IO::Path Anyway?

To represent paths to either files or directories, use the L<IO::Path> type.
The simplest way to obtain an object of that type is to coerce a L<Str> by
calling the L«C<.IO>|/routine/IO» method on it:

    say 'my-file.txt'.IO; # OUTPUT: «"my-file.txt".IO␤»

It may seem like something is missing here—there is no volume or absolute
path involved—but that information is actually present in the object. You can
see it by using the L«C<.perl>|/routine/perl» method:

    say 'my-file.txt'.IO.perl;
    # OUTPUT: «IO::Path.new("my-file.txt", :SPEC(IO::Spec::Unix), :CWD("/home/camelia"))␤»

The two extra attributes—C<SPEC> and C<CWD>—specify what type of operating
system semantics the path should use as well as the "current working directory"
for the path, i.e. if it's a relative path, then it's relative to that
directory.

This means that regardless of how you made one, an L<IO::Path> object
technically always refers to an absolute path. Its
L«C<.absolute>|/routine/absolute» and L«C<.relative>|/routine/relative»
methods return L<Str> objects, and they are the correct way to stringify a path.

However, don't be in a rush to stringify anything. Pass paths around as
L<IO::Path|/type/IO::Path> objects. All the routines that operate on paths
can handle them, so there's no need to convert them.
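
The following sketch pokes at a relative path to show the pieces described
above. It doesn't touch the file system, and the first two lines print
whatever directory you happen to run it from:

=begin code
my $path = 'my-file.txt'.IO;

say $path.CWD;          # the directory the path is relative to
say $path.absolute;     # that directory joined with my-file.txt
say $path.relative;     # OUTPUT: «my-file.txt␤»
say $path.is-absolute;  # OUTPUT: «False␤»
=end code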

=head2 Working with Files

=head3 Writing into files

=head4 Writing new content

Let's make some files and write and read data from them! The
L«C<spurt>|/routine/spurt» and L«C<slurp>|/routine/slurp» routines write and
read the data in one chunk. Unless you're working with very large files that
are difficult to store entirely in memory all at the same time, these two
routines are for you.

=for code
"my-file.txt".IO.spurt: "I ♥ Perl!";

The code above creates a file named C<my-file.txt> in the current directory
and then writes the text C<I ♥ Perl!> into it. If Perl 6 is your first language,
celebrate your accomplishment! Try to open the file you created with some other
program to verify what you wrote with your program. If you already know
some other language, you may be wondering if this guide missed anything like
handling encoding or error conditions.

However, that is all the code you need. The string will be encoded as C<utf-8>
by default and the errors are handled via the L<Failure> mechanism:
these are exceptions you can handle using regular conditionals. In this case,
we're letting all potential L<Failures|/type/Failure> get sunk after the
call and so any L<Exceptions|/type/Exception> they contain will be thrown.
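
If you would rather deal with the problem yourself than let it be thrown, a
conditional is enough. This is a sketch only: the path below is assumed to
point into a directory that doesn't exist, so the write is expected to fail:

=begin code
# A path inside a directory that is assumed not to exist
my $result = '/no/such/directory/my-file.txt'.IO.spurt: 'I ♥ Perl!';

if $result {
    say 'Wrote the file';
}
else {
    # Testing the Failure in a conditional marks it as handled,
    # so nothing is thrown here
    say "Could not write the file: { $result.exception.message }";
}
=end code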

=head4 Appending content

If you wanted to add more content to the file we made in the previous section,
you could note that the L«C<spurt> documentation|/routine/spurt» mentions the
C<:append> argument. However, for finer control, let's get ourselves an
L<IO::Handle> to work with:

=for code
my $fh = 'my-file.txt'.IO.open: :a;
$fh.print: "I count: ";
$fh.print: "$_ " for ^10;
$fh.close;

The L«C<.open>|/routine/open» method call opens our L<IO::Path> and returns
an L<IO::Handle>. We passed C<:a> as an argument, to indicate we want to open
the file for writing in append mode.

In the next two lines of code, we use the usual L«C<.print>|/routine/print»
method on that L<IO::Handle> to print a line with 11 pieces of text
(the C<'I count: '> string and 10 numbers). Note that, once again, the
L<Failure> mechanism takes care of all the error checking for us. If the
L«C<.open>|/routine/open» fails, it returns a L<Failure>, which will throw
when we attempt to call the L«C<.print>|/routine/print» method on it.

Finally, we close the L<IO::Handle> by calling the
L«C<.close>|/routine/close» method on it. It is
I<important that you do it>, especially in large programs or ones that deal
with a lot of files, as many systems have limits to how many files a program
can have open at the same time. If you don't close your handles, eventually
you'll reach that limit and the L«C<.open>|/routine/open» call will fail.
Note that unlike some other languages, Perl 6 does not use reference counting,
so the file handles B<are NOT closed> when the scope they're defined in is left.
They will be closed only when they're garbage collected and failing to close
the handles may cause your program to reach the file limit I<before> the open
handles get a chance to get garbage collected.
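
For simple cases, the C<:append> argument of L«C<spurt>|/routine/spurt»
mentioned at the start of this section sidesteps the issue, as it leaves no
handle for you to close. A sketch, using a made-up file name in C<$*TMPDIR>:

=begin code
my $file = $*TMPDIR.add: 'io-guide-append-demo.txt';  # made-up name

$file.spurt: "I ♥ Perl!";
$file.spurt: " I count: {^10}", :append;  # add to the end, don't overwrite

say $file.slurp;  # OUTPUT: «I ♥ Perl! I count: 0 1 2 3 4 5 6 7 8 9␤»
$file.unlink;
=end code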

=head3 Reading from files

=head4 Using IO::Path

We've seen in previous sections that writing stuff to files takes a single line
of code in Perl 6. Reading from them is similarly easy:

=for code
say 'my-file.txt'.IO.slurp;        # OUTPUT: «I ♥ Perl!␤»
say 'my-file.txt'.IO.slurp: :bin;  # OUTPUT: «Buf[uint8]:0x<49 20 e2 99 a5 20 50 65 72 6c 21>␤»

The L«C<.slurp>|/routine/slurp» method reads the entire contents of the file
and returns them as a single L<Str> object, or as a L<Buf> object if binary
mode was requested by specifying the C<:bin> named argument.

Since L«slurping|/routine/slurp» loads the entire file into memory, it's not
ideal for working with huge files.

The L<IO::Path> type offers two other handy methods:
L«C<.words>|/type/IO::Path#method_words» and
L«C<.lines>|/type/IO::Path#method_lines» that lazily read the file in smaller
chunks and return L<Seq> objects that (by default) don't keep already-consumed
values around.

Here's an example that finds lines in a text file that mention Perl and prints
them out. Despite the file itself being too large to fit into available
L<RAM|https://en.wikipedia.org/wiki/Random-access_memory>, the program will
not have any issues running, as the contents are processed in small chunks:

=for code
.say for '500-PetaByte-File.txt'.IO.lines.grep: *.contains: 'Perl';

Here's another example that prints the first 100 words from a file, without
loading it entirely:

=for code
.say for '500-PetaByte-File.txt'.IO.words: 100;

Note that we did this by passing a limit argument to
L«C<.words>|/type/IO::Path#method_words» instead of, say, using
L<a list indexing operation|/language/operators#index-entry-array_indexing_operator-array_subscript_operator-array_indexing_operator>.
The reason for that is there's still a file handle in use under the hood, and
until you fully consume the returned L<Seq>, the handle will remain open.
If nothing references the L<Seq>, eventually the handle will get closed, during
a garbage collection run, but in large programs that work with a lot of files,
it's best to ensure all the handles get closed right away. So, you should
always ensure the L<Seq> from L<IO::Path>'s
L«C<.words>|/type/IO::Path#method_words» and
L«C<.lines>|/type/IO::Path#method_lines» methods is
L<fully reified|/language/glossary#index-entry-Reify>, and the limit argument
is there to help you with that.
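
Both approaches are sketched below on a small stand-in file. The file name is
made up and the file is generated on the spot, so you can run the example
without a 500-petabyte disk:

=begin code
my $file = $*TMPDIR.add: 'io-guide-lines-demo.txt';  # made-up name
$file.spurt: (1..1000).join: "\n";

# The limit argument: three lines are read and the Seq is then exhausted
.say for $file.lines: 3;  # OUTPUT: «1␤2␤3␤»

# Consuming every element has the same effect on the handle
say $file.lines.elems;    # OUTPUT: «1000␤»

$file.unlink;
=end code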

=head4 Using IO::Handle

Of course, you can read from files using the L<IO::Handle> type, which gives you
much finer control over what you're doing:

=begin code
given 'some-file.txt'.IO.open {
    say .readchars: 8;  # OUTPUT: «I ♥ Perl␤»
    .seek: 1, SeekFromCurrent;
    say .readchars: 15;  # OUTPUT: «I ♥ Programming␤»
    .close
}
=end code

The L<IO::Handle> gives you
L«.read|/type/IO::Handle#method_read»,
L«.readchars|/type/IO::Handle#method_readchars»,
L«.get|/type/IO::Handle#routine_get»,
L«.getc|/type/IO::Handle#method_getc»,
L«.words|/type/IO::Handle#routine_words»,
L«.lines|/type/IO::Handle#routine_lines»,
L«.slurp|/type/IO::Handle#routine_slurp»,
L«.comb|/type/IO::Handle#method_comb»,
L«.split|/type/IO::Handle#method_split»,
and L«.Supply|/type/IO::Handle#method_Supply»
methods to read data from it. That's plenty of options; the catch is that you
need to close the handle when you're done with it.
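
Here's a short sketch that mixes a few of those methods on one handle. The
file name and its three lines of content are invented for the example:

=begin code
my $file = $*TMPDIR.add: 'io-guide-handle-demo.txt';  # made-up name
$file.spurt: "first line\nsecond line\nthird line\n";

my $fh = $file.open;
say $fh.get;           # OUTPUT: «first line␤»
say $fh.getc;          # OUTPUT: «s␤»
say $fh.readchars: 5;  # OUTPUT: «econd␤»
.say for $fh.lines;    # OUTPUT: « line␤third line␤»
$fh.close;

$file.unlink;
=end code

Each call picks up where the previous one stopped, which is the kind of control
the L<IO::Path> shortcuts don't offer.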

Unlike some languages, the handle won't get automatically closed when the
scope it's defined in is left. Instead, it'll remain open until it's garbage
collected. To make the closing business easier, some of the methods let you
specify a C<:close> argument. You can also use the
L«C<will leave> trait|/language/phasers#index-entry-will_trait», or the
C<does auto-close> trait provided by the
L«C<Trait::IO>|https://modules.perl6.org/dist/Trait::IO» module.
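
A sketch of two of these options follows. Note that the second half uses a
C<LEAVE> phaser block rather than the C<will leave> trait itself; that swap,
like the file name, is a choice made for this example:

=begin code
my $file = $*TMPDIR.add: 'io-guide-close-demo.txt';  # made-up name
$file.spurt: "first line\nsecond line\n";

# :close asks the method to close the handle once it has read the data
my $fh = $file.open;
say $fh.slurp(:close).chars;  # OUTPUT: «23␤»
say $fh.opened;               # OUTPUT: «False␤»

# A LEAVE phaser closes the handle however the block is exited
{
    my $fh2 = $file.open;
    LEAVE { .close with $fh2 }
    say $fh2.get;             # OUTPUT: «first line␤»
}

$file.unlink;
=end code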

=head1 The Wrong Way To Do Things

This section describes how NOT to do Perl 6 IO.

=head2 Leave $*SPEC Alone

You may have heard of L«C<$*SPEC>|/language/variables#Dynamic_variables» and
seen some code or books show its usage for splitting and joining path fragments.
Some of the routine names it provides may even look similar to what you've
used in other languages.

However, unless you're writing your own IO framework,
you almost never need to use L«C<$*SPEC>|/language/variables#Dynamic_variables»
directly. L«C<$*SPEC>|/language/variables#Dynamic_variables» provides low-level
stuff, and using it will not only make your code tough to read, it will likely
introduce security issues (e.g. null characters)!

The L«C<IO::Path>|/type/IO::Path» type is the workhorse of the Perl 6 world. It
caters to all path manipulation needs and provides shortcut routines
that let you avoid dealing with file handles. Use that instead of the
L«C<$*SPEC>|/language/variables#Dynamic_variables» stuff.

Tip: you can join path parts with C</> and feed them to
L«C<IO::Path>|/type/IO::Path»'s routines; they'll still do The Right Thing™ with
them, regardless of the operating system.

=for code :skip-test
# WRONG!! TOO MUCH WORK!
my $fh = open $*SPEC.catpath: '', 'foo/bar', $file;
my $data = $fh.slurp;
$fh.close;

=for code :skip-test
# RIGHT! Use IO::Path to do all the dirty work
my $data = 'foo/bar'.IO.add($file).slurp;

However, it's fine to use L«C<$*SPEC>|/language/variables#Dynamic_variables»
for things not otherwise provided by L<IO::Path>.
For example, the L<C«.devnull» method|/routine/devnull>:

=for code :skip-test
{
    temp $*OUT = open :w, $*SPEC.devnull;
    say "In space no one can hear you scream!";
}
say "Hello";
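
For the everyday splitting and joining that tempts people to reach for
C<$*SPEC>, L«C<IO::Path>|/type/IO::Path» methods are usually enough. The
sketch below uses a made-up path, and the output shown assumes a Unix-like
system, where C</> is the directory separator:

=begin code
# The file name is made up; nothing is read from or written to disk
my $path = 'foo/bar'.IO.add: 'my-file.txt';

say $path;            # OUTPUT: «"foo/bar/my-file.txt".IO␤»
say $path.basename;   # OUTPUT: «my-file.txt␤»
say $path.extension;  # OUTPUT: «txt␤»
say $path.parent;     # OUTPUT: «"foo/bar".IO␤»
=end code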

=head2 Stringifying IO::Path

Don't use the C<.Str> method to stringify L«C<IO::Path>|/type/IO::Path» objects,
unless you just want to display them somewhere for informational purposes.
The C<.Str> method returns whatever basic path string the
L«C<IO::Path>|/type/IO::Path» was instantiated with. It doesn't consider the
value of the L«C<$.CWD> attribute|/type/IO::Path#attribute_CWD». For example, this
code is broken:

=for code
# WRONG!! .Str DOES NOT USE $.CWD!
my $path = 'foo'.IO;
chdir 'bar';
run <tar -cvvf archive.tar>, ~$path;

The L«C<chdir>|/routine/chdir» call changed the value of the current directory,
but the C<$path> we created is relative to the directory before that change.

However, the L«C<IO::Path>|/type/IO::Path» object I<does> know what directory
it's relative to. We just need to use L«C<.absolute>|/routine/absolute» or
L«C<.relative>|/routine/relative» to stringify the object. Both routines return
a L«C<Str>|/type/Str» object; they only differ in whether the result is an
absolute or relative path. So, we can fix our code like this:

=for code
# RIGHT!! .absolute does consider the value of $.CWD!
my $path = 'foo'.IO;
chdir 'bar';
run <tar -cvvf archive.tar>, $path.absolute;
# Also good:
run <tar -cvvf archive.tar>, $path.relative;
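
You can watch the difference without running any external program. In this
sketch, C<$*TMPDIR> is just a stand-in for the directory the program moved to,
and C<temp> keeps the change confined to the block:

=begin code
my $path   = 'foo'.IO;
my $before = $path.absolute;

{
    # Pretend the program moved elsewhere; $*TMPDIR is just a stand-in
    temp $*CWD = $*TMPDIR;

    say ~$path;                     # OUTPUT: «foo␤»
    say $path.absolute eq $before;  # OUTPUT: «True␤»
}
=end code

The stringified path is the same bare C<foo> wherever you are, while
C<.absolute> keeps pointing at the original location.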

=head2 Be mindful of $*CWD

While usually out of view, every L«C<IO::Path>|/type/IO::Path» object, by
default, uses the current value of
L«C<$*CWD>|/language/variables#Dynamic_variables» to set its
L«C<$.CWD> attribute|/type/IO::Path#attribute_CWD». This means there are two
things to pay attention to.

=head3 temp the $*CWD

This code is a mistake:

=for code
# WRONG!!
my $*CWD = "foo".IO;

The C<my $*CWD> declaration made L«C<$*CWD>|/language/variables#Dynamic_variables»
undefined. The L«C<.IO>|/routine/IO» coercer
then goes ahead and sets the L«C<$.CWD> attribute|/type/IO::Path#attribute_CWD»
of the path it's creating to the stringified version of the undefined C<$*CWD>,
which is an empty string.

The correct way to perform this operation is to use
L«C<temp>|/routine/temp» instead of C<my>. It'll localize the effect of changes
to L«C<$*CWD>|/language/variables#Dynamic_variables», just like C<my> would,
but it won't make it undefined, so the L«C<.IO>|/routine/IO» coercer will still
get the correct old value:

=for code
temp $*CWD = "foo".IO;

Better yet, if you want to perform some code in a localized
L«C<$*CWD>|/language/variables#Dynamic_variables», use the
L«C<indir> routine|/routine/indir» for that purpose.
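
A minimal sketch of that approach, with C<$*TMPDIR> standing in for whatever
directory you actually need:

=begin code
my $original = $*CWD;

# $*TMPDIR is only a stand-in for whatever directory you need
indir $*TMPDIR, {
    say 'foo'.IO.absolute;  # foo inside the temporary directory
}

# Outside the block, $*CWD is back to what it was
say $*CWD.absolute eq $original.absolute;  # OUTPUT: «True␤»
=end code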

=end pod

# vim: expandtab softtabstop=4 shiftwidth=4 ft=perl6
